Blind Construction of Optimal Nonlinear Recursive Predictors for Discrete Sequences
We present a new method for nonlinear prediction of discrete random sequences
under minimal structural assumptions. We give a mathematical construction for
optimal predictors of such processes, in the form of hidden Markov models. We
then describe an algorithm, CSSR (Causal-State Splitting Reconstruction), which
approximates the ideal predictor from data. We discuss the reliability of CSSR,
its data requirements, and its performance in simulations. Finally, we compare
our approach to existing methods using variable-length Markov models and
cross-validated hidden Markov models, and show theoretically and experimentally
that our method delivers results superior to the former and at least comparable
to the latter.
Comment: 8 pages, 4 figures
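The abstract does not spell out the algorithm, but the core idea of causal-state splitting can be pictured roughly as follows: estimate the next-symbol distribution conditioned on each history of increasing length, and group together histories whose distributions agree under a statistical test. The Python sketch below is only an illustration of that idea under simplifying assumptions (a crude pooled chi-squared comparison, a fixed threshold, no determinization step, and invented function names); it is not the CSSR implementation described in the paper.

# Illustrative sketch of causal-state splitting in the spirit of CSSR.
# NOT the authors' implementation: the test, threshold, and names are
# assumptions made purely for exposition.
from collections import Counter, defaultdict

def next_symbol_counts(seq, history_len):
    """Empirical next-symbol counts conditioned on each history of a given length."""
    counts = defaultdict(Counter)
    for i in range(history_len, len(seq)):
        hist = tuple(seq[i - history_len:i])
        counts[hist][seq[i]] += 1
    return counts

def distributions_differ(c1, c2, alphabet, threshold=3.84):
    """Crude pooled chi-squared test that two next-symbol count vectors disagree."""
    n1, n2 = sum(c1.values()), sum(c2.values())
    if n1 == 0 or n2 == 0:
        return False
    stat = 0.0
    for a in alphabet:
        pooled = (c1[a] + c2[a]) / (n1 + n2)
        if pooled > 0:
            stat += n1 * (c1[a] / n1 - pooled) ** 2 / pooled
            stat += n2 * (c2[a] / n2 - pooled) ** 2 / pooled
    return stat > threshold

def split_states(seq, max_history=2):
    """Group histories of length max_history into candidate states by their
    predictive (next-symbol) distributions."""
    alphabet = sorted(set(seq))
    states = []
    for length in range(1, max_history + 1):
        counts = next_symbol_counts(seq, length)
        states = []                       # each state maps history -> next-symbol Counter
        for hist, c in counts.items():
            for st in states:
                ref = next(iter(st.values()))
                # place the history in the first state it statistically matches
                if not distributions_differ(c, ref, alphabet):
                    st[hist] = c
                    break
            else:
                states.append({hist: c})  # no match: split off a new state
    return states

# Toy usage: a short periodic-ish binary-alphabet sequence.
seq = list("aabaabaababaabaab") * 30
for k, st in enumerate(split_states(seq, max_history=2)):
    print("state", k, "histories:", sorted(st))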
What Is a Macrostate? Subjective Observations and Objective Dynamics
We consider the question of whether thermodynamic macrostates are objective
consequences of dynamics, or subjective reflections of our ignorance of a
physical system. We argue that they are both; more specifically, that the set
of macrostates forms the unique maximal partition of phase space which 1) is
consistent with our observations (a subjective fact about our ability to
observe the system) and 2) obeys a Markov process (an objective fact about the
system's dynamics). We review the ideas of computational mechanics, an
information-theoretic method for finding optimal causal models of stochastic
processes, and argue that macrostates coincide with the "causal states" of
computational mechanics. Defining a set of macrostates thus consists of an
inductive process where we start with a given set of observables, and then
refine our partition of phase space until we reach a set of states which
predict their own future, i.e. which are Markovian. Macrostates arrived at in
this way are provably optimal statistical predictors of the future values of
our observables.
Comment: 15 pages, no figures
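For readers unfamiliar with computational mechanics, the "causal states" referred to here are the equivalence classes of pasts that induce the same conditional distribution over futures. In the standard notation of that literature (introduced here for reference; the abstract itself does not use it), with a realized past written as an overleft arrow and the future as an overright arrow:

% Two pasts are equivalent iff they predict the same distribution over futures;
% a causal state is such an equivalence class.
\overleftarrow{x} \sim_\epsilon \overleftarrow{x}'
  \iff
  \Pr\bigl(\overrightarrow{X} \,\big|\, \overleftarrow{X} = \overleftarrow{x}\bigr)
  = \Pr\bigl(\overrightarrow{X} \,\big|\, \overleftarrow{X} = \overleftarrow{x}'\bigr),
\qquad
\epsilon(\overleftarrow{x}) = \bigl\{ \overleftarrow{x}' : \overleftarrow{x}' \sim_\epsilon \overleftarrow{x} \bigr\}.

Refining a candidate partition of phase space until its cells have this property is exactly the inductive procedure the abstract describes.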
Predictive PAC Learning and Process Decompositions
We informally call a stochastic process learnable if it admits a
generalization error approaching zero in probability for any concept class with
finite VC-dimension (IID processes are the simplest example). A mixture of
learnable processes need not be learnable itself, and certainly its
generalization error need not decay at the same rate. In this paper, we argue
that it is natural in the predictive PAC setting to condition not on the past observations
but on the mixture component of the sample path. This definition not only
matches what a realistic learner might demand, but also allows us to sidestep
several otherwise grave problems in learning from dependent data. In
particular, we give a novel PAC generalization bound for mixtures of learnable
processes with a generalization error that is not worse than that of each
mixture component. We also provide a characterization of mixtures of absolutely
regular (β-mixing) processes, of independent probability-theoretic
interest.
Comment: 9 pages, accepted in NIPS 2013
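To make the conditioning step concrete: writing the law of the process as a mixture m = ∫ μ_θ ρ(dθ) over components θ, conditioning on the component (rather than on past observations) lets a uniform deviation bound for the individual components pass directly to the mixture. The display below, in notation introduced here, is an illustrative restatement of that observation, not the paper's exact theorem:

% R_\theta is the risk under component \theta, \widehat{R}_n the empirical risk
% on a sample path of length n, and \theta inside the left-hand probability is
% the (random) mixture component of that path.
\Pr_{m}\!\Bigl(\sup_{h \in \mathcal{H}} \bigl|\widehat{R}_n(h) - R_\theta(h)\bigr| > \varepsilon\Bigr)
  = \int \Pr_{\mu_\theta}\!\Bigl(\sup_{h \in \mathcal{H}} \bigl|\widehat{R}_n(h) - R_\theta(h)\bigr| > \varepsilon\Bigr)\, \rho(d\theta)
  \le \sup_{\theta} \Pr_{\mu_\theta}\!\Bigl(\sup_{h \in \mathcal{H}} \bigl|\widehat{R}_n(h) - R_\theta(h)\bigr| > \varepsilon\Bigr),

so the mixture's generalization error is no worse than that of its worst component, matching the claim in the abstract.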
Consistency of Maximum Likelihood for Continuous-Space Network Models
Network analysis needs tools to infer distributions over graphs of arbitrary
size from a single graph. Assuming the distribution is generated by a
continuous latent space model which obeys certain natural symmetry and
smoothness properties, we establish three levels of consistency for
non-parametric maximum likelihood inference as the number of nodes grows: (i)
the estimated locations of all nodes converge in probability on their true
locations; (ii) the distribution over locations in the latent space converges
on the true distribution; and (iii) the distribution over graphs of arbitrary
size converges.
Comment: 21 pages
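The setting can be pictured with a standard continuous latent space model: each node independently draws a latent location, and edges appear independently with probability given by a symmetric, smooth link function of the two endpoints' locations; maximum likelihood then tries to recover the locations, their distribution, and the induced distribution over graphs. The Python sketch below simulates such a model purely for illustration; the Gaussian location distribution, the logistic-in-distance link, and the function names are assumptions chosen here, not the paper's specification.

# Minimal simulation of a continuous latent space network model (illustrative only).
import numpy as np

def sample_latent_space_graph(n, link, dim=2, rng=None):
    """Draw i.i.d. latent locations, then edges independently given the locations."""
    rng = np.random.default_rng() if rng is None else rng
    locs = rng.standard_normal((n, dim))        # latent locations (Gaussian chosen arbitrarily)
    adj = np.zeros((n, n), dtype=int)
    for i in range(n):
        for j in range(i + 1, n):
            p = link(locs[i], locs[j])          # symmetric, smooth link probability
            adj[i, j] = adj[j, i] = int(rng.random() < p)
    return locs, adj

def logistic_distance_link(x, y):
    """Edge probability falls off smoothly with latent distance."""
    return 1.0 / (1.0 + np.exp(np.linalg.norm(x - y) - 1.0))

locs, adj = sample_latent_space_graph(100, logistic_distance_link,
                                      rng=np.random.default_rng(0))
print("edge density:", adj.sum() / (100 * 99))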